A Spoken Afrikaans Language Resource Designed for Research on Pronunciation Variations
Abstract
In this contribution, the design, collection, annotation and planned distribution of a new spoken language resource of Afrikaans (SALAR) are discussed. The corpus contains speech of mother-tongue speakers of Afrikaans and is intended to become a primary national language resource for phonetic research and research on pronunciation variations. As such, the corpus is designed to expose pronunciation variations due to regional accents, speech rate (normal and fast speech) and speech mode (read and spontaneous speech). The corpus is collected by the Potchefstroom Campus of the North-West University, but in all phases of the corpus creation process there was close collaboration with ELIS-UG (Belgium), one of the institutions that have been engaged in the creation of the Spoken Dutch Corpus (CGN).
Similar resources
MAT: a tool for L2 pronunciation errors annotation
In the area of Computer Assisted Language Learning (CALL), second language (L2) learners’ spoken data is an important resource for analysing and annotating typical L2 pronunciation errors. The annotation of L2 pronunciation errors in spoken data is not an easy task, though: it normally requires manual annotation from trained linguists or phoneticians. In order to facilitate this task, in this pap...
On the Usefulness of Large Spoken Language Corpora for Linguistic Research
In the past, fundamental linguistic research was typically conducted on small data sets that were handcrafted for the specific research at hand. However, from the eighties onwards, many large spoken language corpora have become available. This study investigates the usefulness of large multi-purpose spoken language corpora for fundamental linguistic research. A research task was designed in whi...
Verifying pronunciation dictionaries using conflict analysis
We describe a new language-independent technique for automatically identifying errors in an electronic pronunciation dictionary by analyzing the source of conflicting patterns directly. We evaluate the effectiveness of the technique in two ways: we perform a controlled experiment using artificially corrupted data (allowing us to measure precision and recall exactly); and then apply the techniqu...
Pronunciation variations of Spanish-accented English spoken by young children
When learning to speak English, non-native speakers may pronounce some English phonemes differently from native speakers. These pronunciation variations can degrade an automatic speech recognition system’s performance on accented English. This paper is a first attempt to find common pronunciation variations in Spanish-accented English as spoken by young children. The analysis of pronunciation va...
DISCO: development and integration of speech technology into courseware for language learning
Recent research has shown that a properly designed ASR-based CALL system (Dutch-CAPT) was capable of detecting pronunciation errors and of providing comprehensible feedback on pronunciation. Since pronunciation is not the only skill required for speaking a second language, we explored the possibility of extending the Dutch-CAPT approach to other aspects of speaking proficiency like morphology an...
Publication date: 2004